RNA sequence analysis using covariance models.
Authors
Abstract
We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models 'covariance models'. A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences in sequence databases. A model can be built automatically from an existing sequence alignment. We also describe an algorithm for learning a model and hence a consensus secondary structure from initially unaligned example sequences and no prior structural information. Models trained on unaligned tRNA examples correctly predict tRNA secondary structure and produce high-quality multiple alignments. The approach may be applied to any family of small RNA sequences.
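The "covariance" in the model name refers to correlated substitutions at base-paired positions of an alignment. As a rough illustration only, the sketch below scores that signal with mutual information between two alignment columns. The abstract does not spell out the scoring details, so the function name `column_mutual_information`, the toy alignment, and the choice of plain mutual information in bits are assumptions made for this example, not the authors' implementation.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences (hypothetical input
    format chosen for this sketch). High values suggest the two columns covary,
    as expected for positions that form a conserved base pair.
    """
    pairs = [(seq[i], seq[j]) for seq in alignment]
    n = len(pairs)
    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (made up for illustration): columns 0 and 4 always form a
# Watson-Crick pair, while columns 1 and 3 are invariant.
toy_alignment = ["GACUC", "CAGUG", "AAUUU", "UAAUA"]

paired_score = column_mutual_information(toy_alignment, 0, 4)
invariant_score = column_mutual_information(toy_alignment, 1, 3)
print(paired_score, invariant_score)
```

In this toy case the compensating changes in columns 0 and 4 give the maximum score for four equally frequent residues, while the invariant columns score zero even though they are perfectly conserved. That contrast is why a pairwise covariation term carries structural information that a per-column sequence profile cannot.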
Similar resources
RNA Sequence Analysis Using Covariance Models (running title: Covariance models of RNA)
We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models 'covariance models'. A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences in ...
CMfinder - a covariance model based RNA motif finding algorithm
MOTIVATION: The recent discoveries of large numbers of non-coding RNAs and computational advances in genome-scale RNA search create a need for tools for automatic, high-quality identification and characterization of conserved RNA motifs that can be readily used for database search. Previous tools fall short of this goal. RESULTS: CMfinder is a new tool to predict RNA motifs in unaligned sequenc...
Thermodynamic matchers for the construction of the cuckoo RNA family
RNA family models describe classes of functionally related, non-coding RNAs based on sequence and structure conservation. The most important method for modeling RNA families is the use of covariance models, which are stochastic models that serve in the discovery of yet unknown, homologous RNAs. However, the performance of covariance models in finding remote homologs is poor for RNA families wit...
Rfam: an RNA family database
Rfam is a collection of multiple sequence alignments and covariance models representing non-coding RNA families. Rfam is available on the web in the UK at http://www.sanger.ac.uk/Software/Rfam/ and in the US at http://rfam.wustl.edu/. These websites allow the user to search a query sequence against a library of covariance models, and view multiple sequence alignments and family annotation. The ...
Modeling the Thermoproteaceae RNase P RNA
The RNA component of the RNase P complex is found throughout most branches of the tree of life and is principally responsible for removing the 5' leader sequence from pre-tRNA transcripts during tRNA maturation. RNase P RNA has a number of universal core features; however, variations in sequence and structure found in homologs across the tree of life require multiple Rfam covariance search model...
Journal title:
- Nucleic Acids Research
Volume 22, Issue 11
Pages -
Publication date 1994